Parallel Checkpointing on a Grid-Enabled Java Platform
Abstract
This article describes the implementation of checkpointing and recovery services in a Java-based distributed platform. Our case study is SUMA, a distributed execution platform implemented on top of Grid services. SUMA is designed to execute Java bytecode, with additional support for parallel processing. The SUMA middleware is built on commodity software and communication technologies, including Java, CORBA, and Globus services. The implementation of SUMA that runs on top of Globus services is called SUMA/G.
Similar resources
The One-Click Grid-Resource Model
This paper introduces the One-Click Grid resource, which allows any computer with a Java-enabled web browser to safely provide resources to the Grid without any software installation. This represents a vast increase in the number of potential Grid resources that may be made available to help public interest research. While the model does impose restrictions on the application writer, the technol...
Interactive Visualization of Grid Monitoring Data on Multiple Client Platforms
Most current Grid monitoring systems provide a visual user interface. With recent advances in multimedia capabilities in user terminals, there is a strong trend towards interactive, multi-modal and multi-platform visualization. In this paper we describe a multi-platform visualization architecture and a Web based service built upon it, which provides a view of the monitored Grid hierarchy, and t...
Debugging MPI Grid Applications Using Net-dbx
Problem solving using grid computing environments has become very popular amongst research groups in computation-demanding fields. This is due to the ability of Grid technologies and middleware to enable large-scale resource sharing. Application-development in such environments is a challenging process, thus the need for grid enabled development tools is also one that has to be fulfilled. In ou...
Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid
Grid computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing that periodically ...
A checkpointing-enabled and resource-aware Java Virtual Machine for efficient and robust e-Science applications in grid environments
Object-oriented programming languages are presently the dominant paradigm of application development (e.g., Java, .NET). Lately, more and more Java applications have long (or very long) execution times and manipulate large amounts of data/information, gaining relevance in fields related to e-Science (with Grid and Cloud computing). Significant examples include Chemistry, Computational Bi...